The UvA color document dataset

نویسندگان
چکیده

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The IUPR Dataset of Camera-Captured Document Images

Major challenges in camera-base document analysis are dealing with uneven shadows, high degree of curl and perspective distortions. In CBDAR 2007, we introduced the first dataset (DFKI-I) of camera-captured document images in conjunction with a page dewarping contest. One of the main limitations of this dataset is that it contains images only from technical books with simple layouts and moderat...

متن کامل

A Large Benchmark Dataset for Web Document Clustering

Targeting useful and relevant information on the WWW is a topical and highly complicated research area. A thriving research effort that feeds into this area is document clustering, which overlaps closely with areas usually known as text classification and text categorisation. A foundational aspect of such research (which has been proven over and over again in other research disciplines) is the ...

متن کامل

Top-K Color Queries for Document Retrieval

In this paper we describe a new efficient (in fact optimal) data structure for the top-K color problem. Each element of an array A is assigned a color c with priority p(c). For a query range [a, b] and a value K, we have to report K colors with the highest priorities among all colors that occur in A[a..b], sorted in reverse order by their priorities. We show that such queries can be answered in...

متن کامل

Color reduction for complex document images

A new technique for color reduction of complex document images is presented in this article. It reduces significantly the number of colors of the document image (less than 15 colors in most of the cases) so as to have solid characters and uniform local backgrounds. Therefore, this technique can be used as a preprocessing step by text information extraction applications. Specifically, using the ...

متن کامل

Document Processing for Automatic Color Form Dropout

Color dropout refers to the process of converting color form documents to black and white by removing the colors that are part of the blank form and maintaining only the information entered in the form. In this paper, no prior knowledge of the form type is assumed. Color dropout is performed by associating darker non-dropout colors with information that is entered in the form and needs to be pr...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: International Journal of Document Analysis and Recognition (IJDAR)

سال: 2005

ISSN: 1433-2833,1433-2825

DOI: 10.1007/s10032-004-0135-2